Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Authors

Rasool Azimi

Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran

Hedieh Sajedi

Department of Computer Science, College of Science, University of Tehran, Tehran, Iran

Abstract

Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group (cluster) are more similar, in some sense, to each other than to objects in other groups. It is a main task of exploratory data mining and a common technique for statistical data analysis. This paper proposes an improved version of the K-Means algorithm, namely Persistent K-Means, which alters the convergence method of K-Means to provide more accurate clustering results than the K-Means algorithm and its variants by increasing the clusters' coherence. Persistent K-Means uses an iterative approach to discover the best result across consecutive iterations of the K-Means algorithm.
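The abstract does not say how the "best result" is chosen or when the repetition stops, so the following Python sketch is only one possible reading of the idea, not the authors' algorithm. It assumes that cluster coherence is measured by the within-cluster sum of squared errors (SSE) and that repetition stops after a fixed number of consecutive runs without improvement. The names `kmeans`, `best_of_runs`, and `patience` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd-style k-means; returns (centroids, labels, sse)."""
    centroids = rng.sample(points, k)  # random initial centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


def best_of_runs(points, k, patience=5, seed=0):
    """Assumed stopping rule: repeat k-means and keep the lowest-SSE result
    until `patience` consecutive runs fail to improve on it."""
    rng = random.Random(seed)
    best = kmeans(points, k, rng)
    stale = 0
    while stale < patience:
        candidate = kmeans(points, k, rng)
        if candidate[2] < best[2] - 1e-12:  # strictly better SSE
            best, stale = candidate, 0
        else:
            stale += 1
    return best
```

A call such as `best_of_runs(points, 2)` on a list of coordinate tuples returns the centroids, labels, and SSE of the most coherent run seen. Under these assumptions the result is never worse, in SSE terms, than the first single run from the same seed.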


Similar resources

Enhanced Clustering Based on K-means Clustering Algorithm and Proposed Genetic Algorithm with K-means Clustering

This paper targets a variety of techniques, approaches, and distinct research areas that are useful and are marked as the crucial field of data mining technologies. The overall purpose of the data mining process is to extract useful information from a large set of data and convert it into a form that is comprehensible for further use. Clustering i...

A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS

Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are used so widely in many fields, a lot of research is still going on to find the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of cluster centers and hence can become trapped in a local optimum. In this paper...

Fast k-means algorithm clustering

k-means has recently been recognized as one of the best algorithms for clustering unsupervised data. Since k-means depends mainly on distance calculations between all data points and the centers, the time cost will be high when the dataset is large (for example, more than 500 million points). We propose a two-stage algorithm to reduce the time cost of distance calculation for huge ...

Extending K-Means Clustering Algorithm

The K-Means algorithm for clustering has the drawback of always maintaining K clusters. This leads to ineffective handling of noisy data and outliers. A noisy data item is defined as one having little similarity with the closest cluster's centroid. In K-Means, a noisy data item is placed in the most similar cluster, even though this similarity is low relative to the similarity of other data items in the same c...

Improved K-means Clustering Algorithm Based on Genetic Algorithm

Through comparison and analysis of clustering algorithms, this paper presents an improved K-means clustering algorithm. It uses a genetic algorithm to select the initial cluster centers, uses the Z-score to standardize the data, and adopts a new method to evaluate cluster centers, all of which reduce the effect of isolated points and improve the accuracy of clustering. Experiments show that the algorithm to fin...

Journal title:

Journal of Computer and Robotics

Volume 7, Issue 1, pages 57-66

Keywords

data mining, clustering, K-Means, Persistent K-Means
